Abstract

Repetitions within a given genealogical tree provide some information about the degree of
consanguinity of a population. They can be analyzed with techniques usually employed in
statistical physics when dealing with fixed point transformations. In particular we show that
the tree features strongly depend on the fractions of males and females in the population, and
also on the offspring probability distribution. We check different possibilities, some of them
relevant to human groups, and compare them with simulations.

One of the main problems encountered in the efforts to preserve species from extinction
is genetic diversity. Indeed, besides environmental threats to the welfare of a species, a less
obvious but nonetheless extremely important issue is the size of the genetic
pool from which the genes of an individual are taken. This problem is related to the degree
of consanguinity within the population: the more relatives mate among themselves, the more
reduced is the genetic diversity of the population. There are examples in the wild
of species with a relatively small genetic variety: from molecular biology it is known that
cheetahs, for example, show a high degree of consanguinity, probably due to some bottleneck
in the number of individuals in their population some ten thousand years ago; in human
societies, it is well known that high-ranking aristocrats in Europe kept marrying only among
themselves. As a consequence, the appearance of a hemophiliac individual spread the genetic
disease all over the reigning houses of Europe. This example sheds light on the relevance of
the genetic diversity of a population for its conservation: species with a small genetic pool
are weaker against genetic diseases. The above examples show that genetic redundancy can
come as a consequence of a reduced population.

In this Letter we address the same problem from a different (but, we believe, complementary)
standpoint: we are interested in the genealogical trees of individuals of species where
the male-to-female ratio is not 1 as in humans (here we define this ratio taking into account
only males and females that are sexually mature). Among such examples we can name lions,
sea lions and some antelopes, where each successfully reproducing male mates with more
than one female (similar arguments could also be applied to polygamous human groups).
Extreme cases are insects like bees and termites, where for every reproductive female (queen)
there are very many males.

We measure the genetic redundancy in the gene pool of an individual by counting the
number of times that each of its ancestors many generations in the past appears
in its genealogical tree. Indeed, if no relatives mated among themselves then,
since every individual has a mother and a father, it would have $2^g$ ancestors $g$ generations in
the past, half of them males and half of them females. Each of them would appear only once
in the genealogical tree of their present descendants. Going back some tens of generations
in the past, the number of ancestors would largely exceed the population itself. The only
way out of this paradox is to assume that relatives do indeed mate among themselves. As
a consequence some individuals appear more than once in the genealogical tree of their
descendants (that is, more than one branch of the tree originates from such individuals),
thus reducing the genetic pool from which their genes are taken.
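
As a quick illustration of the paradox, the following minimal sketch counts how many generations it takes for the $2^g$ would-be distinct ancestors to outnumber a population of fixed size. The function name is hypothetical, and the population of 20000 individuals is borrowed from the simulations of Fig. 1 purely as an example.

```python
def generations_until_paradox(population: int) -> int:
    """Return the smallest g for which 2**g distinct ancestors exceed the population."""
    g = 0
    while 2 ** g <= population:
        g += 1
    return g


# Illustrative value: the population size used in the simulations of Fig. 1.
g_star = generations_until_paradox(20000)
print(g_star, 2 ** g_star)
```

For 20000 individuals the threshold is $g = 15$ (since $2^{15} = 32768$), so repetitions become unavoidable well before a few tens of generations.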

We take a population of $N$ individuals, and we assume that it does not change in time.
There are $fN$ males and $(1 - f)N$ females, and the fraction $f$ also remains
constant in time. Every male therefore mates, on average, with $1/f - 1$ females. Here
we generally make the (politically incorrect) assumption that the fraction of males is less
than 1/2. Since in this model there is no difference between males and females, the opposite
situation is obtained with the transformation $f \to 1 - f$ (everything is symmetric with respect
to $f = 1/2$). We apply and extend the scheme developed in [1], generalizing it to
the case of male fractions different from 1/2.

Given an individual in the present generation, we are interested in the number of times
its ancestors at a previous generation $g$ appear in the genealogical tree of that individual
(at $g = 1$ we find the parents, at $g = 2$ the grandparents, and so on). We therefore define
$m_r(g)$ ($f_r(g)$) as the number of males (females) appearing $r$ times at generation $g$ in the
genealogical tree of an individual at generation 0, the present one.

The normalization of $m_r(g)$ and $f_r(g)$ implies that we can write

$$\sum_{r=0}^{\infty} m_r(g)\,\Delta r = fN, \qquad \sum_{r=0}^{\infty} f_r(g)\,\Delta r = (1-f)N \tag{1}$$

where $\Delta r = 1$ trivially (but it is useful to write it explicitly for future rescalings). Since an
individual at generation 0 has $2^{g-1}$ male ancestors (not necessarily distinct) at generation
$g$ (and $2^{g-1}$ female ancestors as well), we can also write

$$\sum_{r=0}^{\infty} r\,m_r(g)\,\Delta r = \sum_{r=0}^{\infty} r\,f_r(g)\,\Delta r = 2^{g-1}. \tag{2}$$

We then define the probabilities connected to $m_r(g)$ and $f_r(g)$. These are probabilities
defined over the population at generation $g$. Therefore we have

$$M_r(g) = \frac{m_r(g)}{fN}, \qquad F_r(g) = \frac{f_r(g)}{(1-f)N}. \tag{3}$$

Using (3) we rewrite (1) as

$$\sum_{r=0}^{\infty} M_r(g)\,\Delta r = \sum_{r=0}^{\infty} F_r(g)\,\Delta r = 1 \tag{4}$$

and (2) as

$$\sum_{r=0}^{\infty} r\,M_r(g)\,\Delta r = \frac{2^{g-1}}{fN}, \qquad \sum_{r=0}^{\infty} r\,F_r(g)\,\Delta r = \frac{2^{g-1}}{(1-f)N}. \tag{5}$$

Finally we rescale $r$, $F_r(g)$ and $M_r(g)$ as

$$P_M(r,g) = \frac{2^{g-1}}{fN}\,M_r(g), \qquad P_F(r,g) = \frac{2^{g-1}}{(1-f)N}\,F_r(g),$$
$$w_M(g) = \frac{fN}{2^{g-1}}\,r, \qquad w_F(g) = \frac{(1-f)N}{2^{g-1}}\,r. \tag{6}$$

With these definitions Eqs. (4) become

$$\int_0^{\infty} P_M(w_M,g)\,dw_M = \int_0^{\infty} P_F(w_F,g)\,dw_F = 1 \tag{7}$$

and Eqs. (5) become

$$\int_0^{\infty} w_M\,P_M(w_M,g)\,dw_M = \int_0^{\infty} w_F\,P_F(w_F,g)\,dw_F = 1. \tag{8}$$

From (7) we see that $P_M(w_M,g)$ and $P_F(w_F,g)$ can be considered true probabilities.
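
The rescaling can be checked on a small histogram. The sketch below is only an illustration: the function name and the toy repetition counts are made up, chosen so that the normalizations (1) and (2) hold, and exact fractions are used so that the sums corresponding to (7) and (8) come out exactly.

```python
from fractions import Fraction


def rescale(counts, n_sex, g):
    """Map repetition counts {r: number of individuals} to a list of (w, P, dw).

    counts : histogram m_r(g) (or f_r(g)) for one sex
    n_sex  : size of that sex, i.e. fN for males or (1 - f)N for females
    g      : generation, counted backward from the present (g >= 1)
    """
    scale = Fraction(n_sex, 2 ** (g - 1))      # w = scale * r, as in Eq. (6)
    rescaled = []
    for r, m_r in sorted(counts.items()):
        prob = Fraction(m_r, n_sex)            # M_r(g), as in Eq. (3)
        # Rescaled probability density P, and bin width dw = scale * Delta r.
        rescaled.append((scale * r, prob / scale, scale))
    return rescaled


# Toy histogram (made up): 8 males at generation g = 5, chosen so that
# sum m_r = 8 and sum r * m_r = 2**4 = 16, i.e. Eqs. (1) and (2) hold.
toy = {0: 2, 1: 2, 2: 2, 5: 2}
points = rescale(toy, n_sex=8, g=5)
norm = sum(p * dw for _, p, dw in points)          # discrete version of Eq. (7)
mean = sum(w * p * dw for w, p, dw in points)      # discrete version of Eq. (8)
print(norm, mean)
```

Both sums equal 1 for any histogram that satisfies (1) and (2), which is the content of Eqs. (7) and (8).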
Next, we can write a system of equations for $w_M(g)$ and $w_F(g)$. A male $i$ at generation $g + 1$
in the past has a number of repetitions that is given by the sum of the repetitions of his
children at generation $g$. Therefore

$$r_{M,i}(g+1) = \sum_{j\ \text{son of}\ i} r_{M,j}(g) + \sum_{j\ \text{daughter of}\ i} r_{F,j}(g) \tag{9}$$

and analogously for females

$$r_{F,i}(g+1) = \sum_{j\ \text{son of}\ i} r_{M,j}(g) + \sum_{j\ \text{daughter of}\ i} r_{F,j}(g). \tag{10}$$

Dividing the first equation by $2^{g-1}/(fN)$ we get

$$w_{M,i}(g+1) = \frac{1}{2}\left[\sum_{j\ \text{son of}\ i} w_{M,j}(g) + \frac{f}{1-f}\sum_{j\ \text{daughter of}\ i} w_{F,j}(g)\right]. \tag{11}$$

Dividing (10) by $2^{g-1}/((1-f)N)$ we get the analogous equation for females.

We assume a stable (on average) population of $N$ individuals divided in two parts
whose proportions are also (on average) stable. Therefore the number of sons (daughters)
that an individual can have has to obey well-defined probability distributions. In our
simulations we proceed backward in time, keeping the population fixed at $N$ and the male
proportion fixed at $f$. Since we assign to every individual a pair of parents at random
in the previous generation, the corresponding son/daughter probability distributions are
binomial distributions. More precisely, the probability that a male has $k$ sons is

$$P_{mm}(k) = \binom{fN}{k}\left(\frac{1}{fN}\right)^{k}\left(1-\frac{1}{fN}\right)^{fN-k} \tag{12}$$

and that he has $k$ daughters is

$$P_{mf}(k) = \binom{(1-f)N}{k}\left(\frac{1}{fN}\right)^{k}\left(1-\frac{1}{fN}\right)^{(1-f)N-k}. \tag{13}$$

Analogous distributions can be written for $P_{ff}(k)$ and $P_{fm}(k)$.
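
The backward-in-time procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used for the simulations reported here: the function name, the seed and the parameter values are hypothetical, and each individual in the tree simply draws a father uniformly among the males and a mother uniformly among the females of the previous generation.

```python
import random
from collections import Counter


def simulate_tree(n_pop, f, generations, seed=0):
    """Follow one individual's genealogical tree backward for `generations` steps.

    Returns two Counters (males, females) mapping the index of each ancestor at the
    final generation to its number of repetitions r > 0; absent ancestors have r = 0.
    """
    rng = random.Random(seed)
    n_males = round(f * n_pop)
    n_females = n_pop - n_males
    # Generation g = 1: the two parents of the individual at generation 0.
    males = Counter({rng.randrange(n_males): 1})
    females = Counter({rng.randrange(n_females): 1})
    for _ in range(generations - 1):
        fathers, mothers = Counter(), Counter()
        # Only individuals already in the tree (r > 0) pass repetitions on, so
        # parents are drawn for them alone; Eqs. (9)-(10) add up the children's r.
        for children in (males, females):
            for r in children.values():
                fathers[rng.randrange(n_males)] += r
                mothers[rng.randrange(n_females)] += r
        males, females = fathers, mothers
    return males, females


g = 12
males, females = simulate_tree(n_pop=2000, f=0.25, generations=g)
print(sum(males.values()), sum(females.values()), 2 ** (g - 1))
print("males absent from the tree:", round(0.25 * 2000) - len(males))
```

Whatever the random draws, the total number of repetitions of each sex at generation $g$ equals $2^{g-1}$, as required by Eq. (2); the number of absent males divided by $fN$ is the quantity that approaches $G_0$ at large $g$.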

We assume that the population is very large ($N \to \infty$) and that all the $w$'s are independent
(this is verified in the limit of large $N$). In this limit the offspring probabilities
become

$$P_{mm}(k,f) = P_{ff}(k,f) = \frac{e^{-1}}{k!}, \qquad P_{mf}(k,f) = P_{fm}(k,1-f) = \frac{e^{-(1-f)/f}}{k!}\left(\frac{1-f}{f}\right)^{k}. \tag{14}$$

In the case $f = 1/2$ we recover the distributions used in [1].
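
The approach of the binomial laws to their Poisson limits can be checked numerically. The sketch below is an illustration only: the helper names and the population sizes are arbitrary choices, and the comparison is restricted to the first few values of $k$.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of k successes in n independent trials of probability p."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def poisson_pmf(k, mean):
    """Poisson probability of k events with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def largest_gap(n_pop, f, k_max=12):
    """Largest pointwise difference between binomial and Poisson laws for k <= k_max.

    Returns (gap for the sons of a male, gap for the daughters of a male).
    """
    n_m = round(f * n_pop)          # number of males, fN
    n_f = n_pop - n_m               # number of females, (1 - f)N
    p = 1.0 / n_m                   # chance that a given child picks a given father
    sons = max(abs(binomial_pmf(k, n_m, p) - poisson_pmf(k, 1.0))
               for k in range(k_max + 1))
    daughters = max(abs(binomial_pmf(k, n_f, p) - poisson_pmf(k, n_f / n_m))
                    for k in range(k_max + 1))
    return sons, daughters


for n_pop in (200, 2000, 20000):
    print(n_pop, largest_gap(n_pop, f=0.25))
```

The gaps shrink roughly in proportion to $1/N$, which is why the Poisson forms are adequate for populations of the size used in the simulations.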



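Upon defining the generating functions

$$G_g(\lambda) = \int_0^{\infty} e^{-\lambda w_M}\,P_M(w_M,g)\,dw_M, \qquad H_g(\mu) = \int_0^{\infty} e^{-\mu w_F}\,P_F(w_F,g)\,dw_F \tag{15}$$

we find then that (11) becomes

$$G_{g+1}(\lambda) = \sum_{k=0}^{\infty}\sum_{j=0}^{\infty} P_{mm}(k)\left[G_g\!\left(\frac{\lambda}{2}\right)\right]^{k} P_{mf}(j)\left[H_g\!\left(\frac{\lambda}{2}\,\frac{f}{1-f}\right)\right]^{j},$$
$$H_{g+1}(\mu) = \sum_{k=0}^{\infty}\sum_{j=0}^{\infty} P_{fm}(k)\left[G_g\!\left(\frac{\mu}{2}\,\frac{1-f}{f}\right)\right]^{k} P_{ff}(j)\left[H_g\!\left(\frac{\mu}{2}\right)\right]^{j}, \tag{16}$$

where the equation for females has also been written explicitly.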



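Substituting the Poisson distributions (14) in (16) we get, after some algebra,

$$G_{g+1}(\lambda) = \exp\left[-\frac{1}{f} + G_g\!\left(\frac{\lambda}{2}\right) + \frac{1-f}{f}\,H_g\!\left(\frac{\lambda}{2}\,\frac{f}{1-f}\right)\right],$$
$$H_{g+1}(\mu) = \exp\left[-\frac{1}{1-f} + \frac{f}{1-f}\,G_g\!\left(\frac{\mu}{2}\,\frac{1-f}{f}\right) + H_g\!\left(\frac{\mu}{2}\right)\right]. \tag{17}$$

These equations are clearly symmetric under $f \to 1 - f$, since we do not make any distinction
between males and females apart from the male proportion $f$.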
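Next, we analyze the stationary equations, $g = \infty$,

$$G(\lambda) = \exp\left[-\frac{1}{f} + G\!\left(\frac{\lambda}{2}\right) + \frac{1-f}{f}\,H\!\left(\frac{\lambda}{2}\,\frac{f}{1-f}\right)\right],$$
$$H(\mu) = \exp\left[-\frac{1}{1-f} + \frac{f}{1-f}\,G\!\left(\frac{\mu}{2}\,\frac{1-f}{f}\right) + H\!\left(\frac{\mu}{2}\right)\right]. \tag{18}$$

The probability that a male (a female) in the past does not appear in the genealogical
tree of a given individual in the present generation is recovered by sending $\lambda, \mu \to \infty$ (by
Tauberian theorems, the limit $\lambda, \mu \to \infty$ corresponds to the limit $w_M, w_F = 0$). Therefore,
upon calling $G_0 = G(\infty)$ and $H_0 = H(\infty)$ we have

$$G_0 = \exp\left(-\frac{1}{f} + G_0 + \frac{1-f}{f}\,H_0\right), \qquad H_0 = \exp\left(-\frac{1}{1-f} + \frac{f}{1-f}\,G_0 + H_0\right). \tag{19}$$

These equations can be solved numerically and the solution is shown in Fig. 1 (Left) (the
results of the simulations agree with this solution up to the third significant digit).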
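
One simple way to solve Eqs. (19) numerically is a fixed-point iteration. The sketch below is an assumption about how this could be done, not necessarily the method used for Fig. 1: the function name and tolerances are hypothetical, and the iteration is started from $G_0 = H_0 = 0$ in order to stay away from the trivial solution $G_0 = H_0 = 1$, which always satisfies (19).

```python
import math


def poisson_fixed_point(f, tol=1e-13, max_iter=100_000):
    """Iterate Eqs. (19) from G0 = H0 = 0 until successive iterates agree within tol."""
    g0 = h0 = 0.0
    for _ in range(max_iter):
        g_new = math.exp(-1.0 / f + g0 + (1.0 - f) / f * h0)
        h_new = math.exp(-1.0 / (1.0 - f) + f / (1.0 - f) * g0 + h0)
        if abs(g_new - g0) + abs(h_new - h0) < tol:
            return g_new, h_new
        g0, h0 = g_new, h_new
    return g0, h0


# Male fractions of the simulations in Fig. 1.
for f in (1 / 2, 1 / 3, 1 / 5, 1 / 8, 1 / 16):
    g0, h0 = poisson_fixed_point(f)
    print(f"f = {f:.4f}   G0 = {g0:.4f}   H0 = {h0:.4f}")
```

For $f = 1/2$ the two equations coincide and reduce to $G_0 = \exp(2G_0 - 2)$, whose nontrivial root is $G_0 \approx 0.203$.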

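Next, we expand (18) around the fixed point assuming that $P_M(w_M) \sim G_0\,\delta(w_M) + w_M^{\beta_M}$
and $P_F(w_F) \sim H_0\,\delta(w_F) + w_F^{\beta_F}$ for $w_M, w_F \to 0$, which translates, by Tauberian theorems, to

$$G(\lambda) = G_0 + A_M\,\lambda^{-\beta_M-1}, \qquad H(\mu) = H_0 + A_F\,\mu^{-\beta_F-1} \tag{20}$$

for $\lambda, \mu \to \infty$. Eqs. (18) then become

$$1 = G_0\left[2^{\beta_M+1} + 2^{\beta_F+1}\,\frac{A_F}{A_M}\left(\frac{1-f}{f}\right)^{\beta_F+2}\lambda^{\beta_M-\beta_F}\right],$$
$$1 = H_0\left[2^{\beta_F+1} + 2^{\beta_M+1}\,\frac{A_M}{A_F}\left(\frac{f}{1-f}\right)^{\beta_M+2}\mu^{\beta_F-\beta_M}\right]. \tag{21}$$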



Eqs. (21) are well defined only if $\beta_M = \beta_F = \beta$, and therefore we get, after some algebra,

$$2^{\beta+1}\,(H_0 + G_0) = 1 \tag{22}$$

from which we can calculate the exponent $\beta$ as a function of $f$, shown in Fig. 1 (Right).
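From (19) it is also possible to get the analytic behavior of $H_0$, $G_0$ and $\beta$ close to $f = 0$:

$$H_0 \simeq 1 - \sqrt{2f}, \qquad G_0 \sim e^{-\sqrt{2/f}}, \qquad \beta \simeq -1 + \frac{\sqrt{2f}}{\ln 2}. \tag{23}$$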
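
Given $G_0$ and $H_0$, Eq. (22) can be inverted directly, $\beta = -1 - \log_2(G_0 + H_0)$. The sketch below illustrates this for the Poisson case; the function names are hypothetical, and the fixed point of Eqs. (19) is obtained by a plain iteration from zero with a fixed number of steps, which is an assumption made for brevity.

```python
import math


def poisson_fixed_point(f, n_iter=5000):
    """Fixed point (G0, H0) of Eqs. (19), iterated from zero a fixed number of times."""
    g0 = h0 = 0.0
    for _ in range(n_iter):
        g0, h0 = (math.exp(-1.0 / f + g0 + (1.0 - f) / f * h0),
                  math.exp(-1.0 / (1.0 - f) + f / (1.0 - f) * g0 + h0))
    return g0, h0


def beta_exponent(f):
    """Solve Eq. (22), 2**(beta + 1) * (G0 + H0) = 1, for beta."""
    g0, h0 = poisson_fixed_point(f)
    return -1.0 - math.log2(g0 + h0)


for f in (1 / 2, 1 / 3, 1 / 5, 1 / 8, 1 / 16):
    print(f"f = {f:.4f}   beta = {beta_exponent(f):+.3f}")
```

With these equations $\beta \approx 0.30$ at $f = 1/2$, while for $f = 1/16$ the exponent comes out negative.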



As an example of distributions, in Fig. 2 we show $M_r(g)$ and $F_r(g)$, Eq. (3), and in the inset
their rescaled counterparts according to (6), for $f = 1/16$. The exponent $\beta$ is negative, as
expected from our analytical calculations. The delta function at $r = 0$ has been omitted for
reasons of scale.

The dependence of $\beta$ on $f$ shows that this exponent is highly nonuniversal and
that it is extremely sensitive to the explicit form of the distributions (14). This becomes
important when looking at real data. In the thirties Lotka [0] fitted the probability for a man
to have $k$ sons in the United States by a geometric distribution $p_k = b_{mm}\,c_{mm}^{k-1}$ for $k \neq 0$, and
$p_0 = d_{mm}$, with $c_{mm} = 0.5893$, $d_{mm} = 0.4825$ and $b_{mm}$ chosen for normalization. Clearly,
such a distribution is not a Poisson distribution as used above. Moreover it would give a
rate of increase in the population of $N_g/N_{g+1} = 1.26$.
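
The normalization constant and the growth rate quoted above follow from the geometric form of the distribution: summing $p_k$ over $k \geq 1$ gives $b_{mm}/(1 - c_{mm}) = 1 - d_{mm}$, and the mean number of sons is $(1 - d_{mm})/(1 - c_{mm})$. A short numerical check, with hypothetical function names and the sums truncated at an arbitrary cutoff:

```python
C_MM = 0.5893   # value quoted in the text
D_MM = 0.4825   # value quoted in the text, p_0


def lotka_pmf(k, c=C_MM, d=D_MM):
    """p_0 = d and p_k = b * c**(k - 1) for k >= 1, with b fixed by normalization."""
    b = (1.0 - c) * (1.0 - d)       # from sum_{k>=1} b c**(k-1) = b / (1 - c) = 1 - d
    return d if k == 0 else b * c ** (k - 1)


def lotka_mean(c=C_MM, d=D_MM):
    """Mean number of sons, sum_k k p_k = (1 - d) / (1 - c)."""
    return (1.0 - d) / (1.0 - c)


total = sum(lotka_pmf(k) for k in range(200))
mean_numeric = sum(k * lotka_pmf(k) for k in range(200))
print(f"b_mm = {(1 - C_MM) * (1 - D_MM):.4f}")
print(f"normalization = {total:.6f}")
print(f"mean sons = {lotka_mean():.3f} (closed form), {mean_numeric:.3f} (summed)")
```

The mean of about 1.26 sons per man is the rate of increase $N_g/N_{g+1}$ quoted in the text.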

Since the definitions of $P$ and $w$ in (6) depend on $g$, the particular value of $N_g$ can be
explicitly incorporated in them. The left-hand side of (11) is now multiplied by $N_g/N_{g+1}$. The
probabilities for a male to be son of a male and for a female to be daughter of a female will be
those of Lotka, and the other ones can be evaluated by keeping the fractions of males
and females in the population constant, which translates into the constraints

$$\frac{1-d_{mf}}{1-c_{mf}} = \frac{1-f}{f}\,\frac{N_g}{N_{g+1}}, \qquad \frac{1-d_{fm}}{1-c_{fm}} = \frac{f}{1-f}\,\frac{N_g}{N_{g+1}}. \tag{24}$$

We can then rewrite (16) as

$$\frac{N_g}{N_{g+1}}\,G_{g+1}(\lambda) = \left(d_{mm} + \frac{(1-c_{mm})(1-d_{mm})\,G_g(\lambda/2)}{1 - c_{mm}\,G_g(\lambda/2)}\right)\left(d_{mf} + \frac{(1-c_{mf})(1-d_{mf})\,H_g(\lambda/2)}{1 - c_{mf}\,H_g(\lambda/2)}\right),$$
$$\frac{N_g}{N_{g+1}}\,H_{g+1}(\mu) = \left(d_{fm} + \frac{(1-c_{fm})(1-d_{fm})\,G_g(\mu/2)}{1 - c_{fm}\,G_g(\mu/2)}\right)\left(d_{ff} + \frac{(1-c_{ff})(1-d_{ff})\,H_g(\mu/2)}{1 - c_{ff}\,H_g(\mu/2)}\right). \tag{25}$$

Here we examine two different cases. First, we take $f = 1/2$ and all $c$'s and $d$'s as from Lotka.
We find that the probability $G_0 = H_0 = 0.231$, different from the one obtained with Poisson
distributions [1]. Then we impose that the population size remains constant, $N_g = N_{g+1}$, but
allow for different male fractions. Moreover, for simplicity, we choose $d = 1 - c$ for the four
probability distributions, in such a way that they become genuine geometric distributions:
$p_{mm}(k) = p_{ff}(k) = 1/2^{k+1}$, $p_{mf}(k) = f(1-f)^k$ and $p_{fm}(k) = (1-f)f^k$. The results for
$G_0$ and $H_0$ are also shown in Fig. 1 (Left). The exponent $\beta$ is shown in Fig. 1 (Right). $G_0$ and
$H_0$ approach their limits for $f \to 0$ as $f^{1/2}$. In particular, the values for $f = 1/2$ are clearly
different from the ones with Poisson distributions [1]. We find therefore that neither $G_0$ and
$H_0$ nor $\beta$ are universal, although their behavior with respect to $f$ does not, qualitatively,
depend on the details of the chosen offspring distribution. Actually, the relevance of the
distribution to be used can hardly be overestimated: one should take distributions obtained from
the analysis of real data, in order to draw more detailed conclusions ||.
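
Both cases can be explored with a small fixed-point solver. The sketch below rests on an assumption about how Eq. (25) is read at the stationary point $\lambda, \mu \to \infty$, namely $(N_g/N_{g+1})\,G_0 = \phi_{mm}(G_0)\,\phi_{mf}(H_0)$ and $(N_g/N_{g+1})\,H_0 = \phi_{fm}(G_0)\,\phi_{ff}(H_0)$, with $\phi(x) = d + (1-c)(1-d)\,x/(1 - c\,x)$ the generating function of each distribution. The function names are hypothetical, the iteration starts from zero, and in the first case the growth ratio is taken as $(1-d)/(1-c)$ from the quoted parameters, as the constraints (24) require for $f = 1/2$.

```python
def geometric_pgf(x, c, d):
    """Generating function of p_0 = d, p_k = (1 - c)(1 - d) c**(k - 1) for k >= 1."""
    return d + (1.0 - c) * (1.0 - d) * x / (1.0 - c * x)


def fixed_point_eq25(params, growth=1.0, tol=1e-13, max_iter=100_000):
    """Iterate the stationary form of Eq. (25) from G0 = H0 = 0.

    params maps 'mm', 'mf', 'fm', 'ff' to (c, d) pairs; growth is N_g / N_{g+1}.
    """
    g0 = h0 = 0.0
    for _ in range(max_iter):
        g_new = geometric_pgf(g0, *params["mm"]) * geometric_pgf(h0, *params["mf"]) / growth
        h_new = geometric_pgf(g0, *params["fm"]) * geometric_pgf(h0, *params["ff"]) / growth
        if abs(g_new - g0) + abs(h_new - h0) < tol:
            return g_new, h_new
        g0, h0 = g_new, h_new
    return g0, h0


# Case 1: f = 1/2, all four distributions with the quoted (c, d) of Lotka.
lotka = (0.5893, 0.4825)
growth = (1.0 - lotka[1]) / (1.0 - lotka[0])
g_lotka, h_lotka = fixed_point_eq25({key: lotka for key in ("mm", "mf", "fm", "ff")}, growth)


# Case 2: constant population and genuine geometric laws, d = 1 - c.
def genuine_geometric(f):
    return {"mm": (0.5, 0.5), "ff": (0.5, 0.5), "mf": (1.0 - f, f), "fm": (f, 1.0 - f)}


g_geo, h_geo = fixed_point_eq25(genuine_geometric(0.5))
print(f"Lotka, f = 1/2:     G0 = H0 = {g_lotka:.3f}")
print(f"geometric, f = 1/2: G0 = H0 = {g_geo:.3f}")
```

Under this reading the first case lands close to the value 0.231 quoted above. In the second case, at $f = 1/2$, the equations reduce to $G_0(2 - G_0)^2 = 1$, whose relevant root is $(3 - \sqrt{5})/2 \approx 0.382$, clearly different from the Poisson value.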

The present results show that, besides bottlenecks in the population size, there may be
other factors affecting the size of the genetic pool from which the genes of an individual
are taken. Indeed, for species with a very low value of $f$ we find that most females do not
contribute to the genes of an individual in the present generation, whereas most males
(who are in any case a small fraction $f$ of the entire population) do. As an extreme case (and
exchanging males with females), in the absence of interbreeding between different hives, a
single queen bee gives its genes to all subsequent generations. Any genetic mutation will
rapidly become a genetic trait of the whole progeny. Bad mutations could
well wipe out the whole family line. Although not dangerous per se, since bees and the like are
extremely numerous, such a feature can make the species more sensitive to population size
fluctuations.

In conclusion, we have generalized the model proposed in [1] to the realistic case of
species and human groups with male-to-female mating ratios different from 1, and analyzed it.
Our results point out that the genes of an individual are taken from a pool whose size
strongly depends on the male-to-female ratio, with important consequences when the population
size strongly fluctuates. We are currently investigating the coupling effects between
these different factors. Yet our results, although qualitatively of general applicability,
clearly show that quantitative estimates can only come when the analytical treatment is
complemented with field data, since, as is evident from Fig. 1 (Left) and Fig. 1 (Right), different
offspring probability distributions give rise to different quantitative results. This is a highly
nonuniversal problem.

We thank F. Guinea for useful comments and discussions. P. De Los Rios thanks the 
Instituto de Ciencia de Materiales in Madrid, where this work was begun, for its kind 
hospitality. This work has been partially supported by the European Network contract 
FMRXCT980183. 
FIGURES

FIG. 1. Left: Asymptotic fraction of males and females who do not belong to the genealogical
tree of a given individual in the present generation. Circles and squares are data from
simulations for 30 generations over a population of 20000 individuals, with (from right to left)
$f = 1/2, 1/3, 1/5, 1/8, 1/16$. Right: Exponent $\beta$ as a function of the fraction $f$ of males.

FIG. 2. Male and female repetition probabilities after 20 and 30 generations (the latter are
marked by arrows) for a male fraction $f = 1/16$. In the inset we show the collapse of the rescaled
distributions.






